feat(llmo): CloudFront CDN log delivery with assume-role setup (LLMO-5566)#2680
feat(llmo): CloudFront CDN log delivery with assume-role setup (LLMO-5566)#2680claudiaboldis wants to merge 6 commits into
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
|
This PR will trigger a minor release when merged. |
…d, CF templates, setupType, rescan endpoint
- gateEdgeOptimizeWizard now derives externalId = toSafeAwsName(imsOrgId) server-side; FE no longer sends externalId
- getEdgeOptimizeBootstrapUrl accepts setupType ('log-only'|'log-and-oae') and selects the matching CF template key
- Add customer-bootstrap-role.yaml and customer-bootstrap-role-log-only.yaml CloudFormation templates
- Add POST /sites/:siteId/llmo/cdn-log-rescan (rescanCdnLogDelivery) for idempotent re-enabling of log delivery
- Remove 14 stale 'externalId missing' tests; add 9 tests for rescanCdnLogDelivery
Introduced by: #2680
…MO-5566) - Remove dead !Organization early-return from gateEdgeOptimizeWizard - Add test for custom EDGE_OPTIMIZE_ROLE_NAME env var path - Add test for rejection with no .message (covers 'unknown error' fallback) Introduced by: #2680
…strapUrl (LLMO-5566) - Add test for log-only setupType without EDGE_OPTIMIZE_TEMPLATE_KEY_LOG_ONLY (default fallback) - Add test for missing IMS org ID in the bootstrap URL inline org check Introduced by: #2680
main #2682 shipped the CloudFront onboarding wizard (LlmoCloudFrontController, cdn-onboard/cloudfront/*) while this branch was open, making the branch's parallel edge-optimize/* wizard redundant. Reset onto main and rebuilt around the only net-new slice: CloudWatch log delivery. - Add enableCdnLogDelivery + rescanCdnLogDelivery on LlmoCloudFrontController under cdn-onboard/cloudfront/log-delivery and .../log-rescan - Assume-role externalId uses main's client-supplied per-session UUID (validateCloudfrontCredentials); imsOrgId for the org-scoped delivery destination is resolved server-side from the site's organization - Reuse tokowaka CloudFrontEdgeClient.listDistributions for the rescan (no local edge-optimize support module needed) - Add src/support/cdn-log-delivery.js (CloudWatch Logs delivery control plane) and @aws-sdk/client-cloudwatch-logs; OpenAPI + capability/route entries; tests Introduced by: #2680
f3bbb27 to
8cc66b0
Compare
MysticatBot
left a comment
There was a problem hiding this comment.
Hey @claudiaboldis,
Verdict: Request changes - two issues to address before merge (concurrency cap and HTTP status correctness).
Complexity: HIGH - large diff; API surface + new dependency.
Changes: Adds CDN log delivery endpoints (single-distribution and bulk rescan) using STS AssumeRole to create CloudWatch Logs delivery sources in customer accounts (13 files).
Note: Recommend a human read before merge - this change modifies the OpenAPI contract (docs/openapi/). The bot review is a complement to, not a replacement for, a human read here.
Must fix before merge
- [Important] Unbounded concurrency in
rescanCdnLogDeliverywill throttle on large accounts -src/controllers/llmo/llmo-cloudfront.js:1228.Promise.allSettledfirescreateCdnLogDeliveryfor ALL distributions in parallel with no concurrency cap. Each call issues 1-3 CloudWatch Logs API calls. A customer with 50+ distributions will hit CloudWatch Logs per-account throttling (10-25 TPS), causing most calls to fail withThrottlingException. Fix: add a concurrency limiter (e.g.p-limit(5)) or batch distributions in small groups. - [Important] Missing server config returns 400 (client error) instead of 500 -
src/controllers/llmo/llmo-cloudfront.js:1164. WhenCDN_LOG_DELIVERY_DEST_ACCOUNT_IDis not set,badRequest(...)tells the caller their input is wrong when the server is misconfigured. Fix: returninternalServerError(...)instead (same applies inrescanCdnLogDeliveryat line ~1218).
Non-blocking (3): minor issues and suggestions
- nit:
docs/LLMO-5566/plan.mdreferencessrc/controllers/llmo/llmo.jsand 16 endpoints, but the actual controller isllmo-cloudfront.jsand this PR delivers 2 endpoints - the plan is misleading from day one - suggestion: Both new handlers diverge from the existing
mutationErrorResponsepattern used by sibling endpoints in this controller (which surfaces actionable AWS errors likeAccessDeniedExceptionandThrottlingExceptionas 400s instead of generic 500) - consider usingmutationErrorResponsein the outer catch for consistency -src/controllers/llmo/llmo-cloudfront.js:1270 - suggestion:
rescanCdnLogDeliveryresponse includes rawoutcome.reason?.messageper distribution, which can expose internal AWS error details (ARNs, role names) to the caller - consider sanitizing to error category only -src/controllers/llmo/llmo-cloudfront.js:1244
|
Mysticat review failed: Claude CLI crashed (exit 1): stderr= stdout={"type":"result","subtype":"error_max_budget_usd","duration_ms":628407,"duration_api_ms":1527224,"is_error":true,"num_turns":38,"stop_reason":"end_turn","session_id":"7e869476-18cc-497e-b372-17f8889bac58","total_cost_usd":10.07213945,"usage":{"input_tokens":19,"cache_creation_input_tokens":224486,"c |
- rescanCdnLogDelivery: cap parallel createCdnLogDelivery calls in batches of 5 to avoid CloudWatch Logs per-account throttling on large accounts - missing CDN_LOG_DELIVERY_DEST_ACCOUNT_ID now returns 500 (server misconfig), not 400 - outer catch uses mutationErrorResponse so AWS AccessDenied/Throttling surface as 4xx, consistent with sibling CloudFront endpoints - per-distribution rescan errors report the AWS error category (name) only, never the raw message (avoids leaking ARNs/role names) - rewrite docs/LLMO-5566/plan.md to reflect the re-homed log-delivery design Introduced by: #2680
There was a problem hiding this comment.
Hey @claudiaboldis,
Verdict: Request changes - two issues to address (concurrency race and missing integration tests).
Complexity: HIGH - large diff; API surface + new dependency + OpenAPI contract change.
Changes: Adds CDN log delivery endpoints (single-distribution and bulk rescan) using STS AssumeRole to create CloudWatch Logs delivery sources in customer accounts, with bounded concurrency and proper error categorization (13 files).
Note: Recommend a human read before merge - this change modifies the OpenAPI contract (docs/openapi/). The bot review is a complement to, not a replacement for, a human read here.
Must fix before merge
- [Important] Unhandled
ConflictExceptionon concurrentCreateDeliverycalls -src/support/cdn-log-delivery.js:167(details inline) - [Important] Missing integration tests for both new endpoints - CLAUDE.md rule: "New or modified endpoints must include integration tests in
test/it/" (details inline)
Non-blocking (2): minor issues and suggestions
- suggestion: Delivery source name (
llmo-cf-{safeOrgId}-{distributionId}) may exceed the CloudWatch Logs 60-character delivery-source name limit for long IMS org IDs - consider a defensive length check or hash-based truncation inbuildDeliverySourceName-src/support/cdn-log-delivery.js:89 - suggestion:
createCdnLogDeliverycreates a newCloudWatchLogsClientper invocation - during a rescan with 50+ distributions this adds unnecessary object churn; consider accepting an optional pre-built client or lifting client creation to the caller -src/support/cdn-log-delivery.js:135
Previously flagged, now resolved
- Unbounded concurrency in
rescanCdnLogDeliverynow capped atCDN_LOG_RESCAN_CONCURRENCY = 5with a batched for-loop - Missing server config (
CDN_LOG_DELIVERY_DEST_ACCOUNT_ID) now correctly returnsinternalServerError()instead ofbadRequest()
Skill: pr-review | Model: us.anthropic.claude-opus-4-6-v1[1m] | Duration: 2m 27s | Cost: $8.49 | Commit: a2e7638ffd078cd5730fc1711ba46b55252fbfb0
If this code review was useful, please react with 👍. Otherwise, react with 👎.
| } | ||
| nextToken = page.nextToken; | ||
| } while (nextToken); | ||
| } |
There was a problem hiding this comment.
issue (blocking): The idempotency check has a TOCTOU window. If enableCdnLogDelivery and rescanCdnLogDelivery run concurrently for the same distribution, both can pass the GetDeliverySource existence check, both succeed at PutDeliverySource (idempotent PUT), and then the second CreateDelivery call hits a ConflictException that is unhandled - it bubbles to the outer catch and returns a 500 to the caller.
Fix: catch ConflictException (or ResourceAlreadyExistsException) from CreateDelivery and treat it as { created: false, alreadyExisted: true }, mirroring the existing idempotency contract. This is the minimal change - the check-then-act pattern stays, but the race outcome is no longer an error.
Alternatively, if concurrent use is not expected for this admin tooling, document the mutual-exclusion constraint in the plan or a code comment.
| @@ -2287,4 +2307,321 @@ describe('LlmoCloudFrontController', () => { | |||
| expect(result.status).to.equal(403); | |||
There was a problem hiding this comment.
issue (blocking): CLAUDE.md rule: "New or modified endpoints must include integration tests in test/it/ - add shared test logic in shared/tests/, seed data in postgres/seed-data/, and a wiring file in postgres/."
Both enableCdnLogDelivery and rescanCdnLogDelivery are new endpoints with unit tests only. Integration tests that exercise the full request lifecycle (auth gating, route resolution, response shape) through the PostgreSQL harness are required by the documented convention.
Note: these endpoints call external AWS APIs via assumed roles, so the IT scope would focus on the auth/routing/validation layers (mocking the AWS calls at the boundary), not the CloudWatch Logs interactions themselves.
Addresses MysticatBot review on PR #2680: - cdn-log-delivery: catch ConflictException/ResourceAlreadyExistsException from CreateDelivery (TOCTOU between concurrent enable/rescan for the same distribution) and return { created:false, alreadyExisted:true } with the winning delivery id, preserving idempotency instead of surfacing a 500 - add integration tests (test/it/postgres + shared) for the log-delivery and log-rescan endpoints covering the LLMO-admin access gate (404/403) and request-body validation (400), mirroring the llmo-cloudflare-onboarding IT pattern Introduced by: #2680
…LMO-5566) Close the codecov/patch gap from the TOCTOU fix: add tests for - CreateDelivery rejecting with a non-conflict error (rethrow path) - conflict where the raced delivery can't be found (findExistingDelivery returns undefined) Introduced by: #2680
There was a problem hiding this comment.
Hey @claudiaboldis,
Verdict: Approve - prior blocking findings are addressed; no new blocking issues.
Complexity: HIGH - large diff; API surface + new dependency.
Changes: Adds CDN log delivery endpoints (single-distribution and bulk rescan) using STS AssumeRole with ConflictException race handling and bounded concurrency (15 files).
Note: Recommend a human read before merge - this change modifies the OpenAPI contract (docs/openapi/). The bot review is a complement to, not a replacement for, a human read here.
Non-blocking (2): minor issues and suggestions
- suggestion:
buildDeliverySourceNamecan produce names exceeding the CloudWatch Logs 60-character delivery-source name limit for long IMS org IDs - consider a defensive length check or hash-based truncation -src/support/cdn-log-delivery.js:89 - suggestion: After a
ConflictExceptionrace increateCdnLogDelivery,findExistingDelivery()could return undefined (eventual consistency), resulting indeliveryId: nullin the response - consider a single retry or an explicit note in the response when the ID cannot be resolved -src/support/cdn-log-delivery.js:199
Previously flagged, now resolved
- Unhandled ConflictException on concurrent CreateDelivery calls now caught with idempotent fallback
- Missing integration tests now added (access-control and validation layers covered in test/it/)
- Unbounded concurrency in rescanCdnLogDelivery now capped at CDN_LOG_RESCAN_CONCURRENCY = 5
- Missing server config now correctly returns internalServerError() instead of badRequest()
Skill: pr-review | Model: us.anthropic.claude-opus-4-6-v1[1m] | Duration: 2m 11s | Cost: $6.05 | Commit: 6d39666c21acedd9e4d9ab324cd107ebe3afb640
If this code review was useful, please react with 👍. Otherwise, react with 👎.
…-5566) Address MysticatBot non-blocking suggestions on PR #2680: - buildDeliverySourceName now hash-truncates names over 60 chars (CloudWatch Logs limit) using a deterministic sha256 suffix, so long IMS org / resource ids stay valid and the name remains stable across calls (idempotency depends on it) - document why deliveryId may be undefined after a ConflictException race (eventual consistency; delivery is enabled regardless, so no retry loop) Introduced by: #2680
Summary
AssumeRole(connector role) so the LLMO UI CloudFront wizard never needs AWS credentials in the browser — all CloudFront and CloudWatch Logs calls are performed by the Lambdaconnect,distributions,prerequisites,origins,behaviors) that power the step-by-step CloudFront BYOCDN setup flowPOST /sites/:siteId/llmo/cdn-log-delivery(diagram step 8) — idempotently creates a CloudWatch Logs delivery source in the customer's account and links it to Adobe's cross-account delivery destination, enabling CloudFront standard log push to the cdn-logs S3 bucketNew files
src/support/edge-optimize.jsassumeConnectorRole(STS),listCloudFrontDistributions,getDistributionConfig, idempotent step-on-poll deploy orchestratorsrc/support/cdn-log-delivery.jslogs:PutDeliverySource+logs:CreateDeliveryvia the assumed roleEndpoints (all
POST, allINTERNAL_ROUTES— admin/IMS-only)POST /sites/:siteId/llmo/edge-optimize/connectPOST /sites/:siteId/llmo/edge-optimize/distributionsPOST /sites/:siteId/llmo/edge-optimize/prerequisitesPOST /sites/:siteId/llmo/edge-optimize/originsPOST /sites/:siteId/llmo/edge-optimize/behaviorsPOST /sites/:siteId/llmo/cdn-log-deliverycustomer-bootstrap-role.yamlinEDGE_OPTIMIZE_TEMPLATE_BUCKET) must addlogs:PutDeliverySource,logs:CreateDelivery,logs:GetDeliverySource,logs:DescribeDeliveries— today the role only grants CloudFront permscdn-logs-<org>delivery destination + cross-account policy for the BYOCDN CloudFront path (the handler surfaces a clear "destination not provisioned" 400 if it's missing)Test plan
Related
🤖 Generated with Claude Code